Abstract - In this paper we consider the use of Covariance Union (CU)
with multi-hypothesis techniques (MHT) and Gaussian Mixture Models
(GMMs) to generalize the conventional mean and covariance
representation of information. More specifically, we address the
representation of multimodal information using multiple mean and
covariance estimates. A significant challenge is to define a rigorous
fusion algorithm that can bound the complexity of the filtering
process. This requires a mechanism for subsuming subsets of modes into
single modes so that the complexity of the representation satisfies a
specified upper bound. We discuss how this can be accomplished using
CU. The practical challenge is to develop efficient implementations of
the CU algorithm. Because of the novelty of the CU algorithm, there are
no existing real-time implementations for use in real applications. In
this paper we address this deficiency by considering a general-purpose
implementation of the CU algorithm based on general nonlinear
optimization techniques. Computational results are reported.

Keywords:  Covariance  Intersection,  Covariance  Union, 
Data  Fusion,  Kalman  Filter,  Multimodal  distributions. 

1  Introduction 

Level-1 information management has matured significantly over the last
decade with the development of rigorous algorithms that are robust to
the effects of unmodeled correlations and corrupt and/or spurious
information in the context of general distributed data fusion networks.
Despite the dramatic theoretical and practical results in the Level-1
arena, there have been very few inroads made into higher level
information management applications. This is due in large measure to
the discrepancy between the relatively simple types of information
encountered in low level tracking and control applications and the much
more varied and richer forms of information that must be processed in
high level applications.

In this paper we explore a methodology for generalizing the unimodal
information representation scheme used in Level-1 contexts to permit
the representation of information that has a more complicated
multimodal structure. This is accomplished by the use of a set of
unimodal state estimates to capture the multiplicity of possible states
of the target of interest. The challenge is to be able to bound the
computational complexity issues that arise from this approach. In this
paper we describe how a mechanism called Covariance Union (CU) [5, 2]
can be applied to reduce the complexity of a multimodal representation
to satisfy a fixed complexity budget while rigorously guaranteeing
information integrity.

The structure of the paper is as follows: Section 2 discusses the issue
of information representation. Section 3 discusses the need for an
information compression mechanism to bound the computational complexity
of the fusion process. CU is shown to be a solution to this problem.
Section 4 discusses computational issues that must be addressed in
order for CU to be applied in practice. Practical algorithms for
implementing CU are described. Section 5 provides experimental results
demonstrating the application of CU. Section 6 discusses the results
presented in the paper.

2  Information  Representation 

Determining how to represent information and uncertainty is a key first
step that impacts all aspects of the data fusion problem. The
representation must provide both an estimate of the state of the target
or system of interest and its associated degree of error or
uncertainty, and the uncertainty must be defined in a form that permits
it to be empirically determined. There must be a rigorous algorithm for
fusing information in the representation, and the computational
complexity of the representation and its associated fusion algorithms
must be bounded for practical application.

By far the most widely used information representation is the mean and
covariance form, where the mean vector defines the best estimate of the
state of the target and the error covariance provides an upper bound on
the expected squared error associated with the mean. For example, the
measured position of an object in two dimensions can be represented as
a vector $a$ consisting of the object's estimated mean position, e.g.,
$a = [x, y]^T$, and an error covariance matrix $A$ that expresses the
uncertainty associated with the estimated mean. If the error in the
estimated mean vector is denoted as $\tilde{a}$, then the error
covariance matrix is an estimate of the expected squared error,
$E[\tilde{a}\tilde{a}^T]$. The estimate is said to be consistent (or
conservative) if and only if $A \ge E[\tilde{a}\tilde{a}^T]$ or,
equivalently, $A - E[\tilde{a}\tilde{a}^T]$ is positive definite or
semidefinite (i.e., has no negative eigenvalues). The full estimate of
a target's state is given by the mean and covariance pair (a, A).

Given two mean and covariance estimates (a, A) and (b, B), the data
fusion problem consists of determining a fused estimate (c, C) that is
guaranteed to be consistent and summarizes the information in the two
estimates with error (in terms of the size of C) that is less than or
equal to that of either estimate. If the two estimates are consistent
and have a precisely known degree of correlation, the Kalman filter can
be applied; otherwise, Covariance Intersection (CI) must be used. Both
algorithms yield guaranteed consistent results when used appropriately.
The limitations of the mean and covariance representation of
information can be found in a variety of practical contexts. For
example, suppose a vehicle is being tracked along a road in an urban
environment. Assuming that it travels at a speed that is average for
the road, its future position can be predicted forward a short length
of time reasonably accurately; however, if it encounters a T-junction
at which it must turn left or right, there are two distinct possible
future positions. The future state can be represented with a single
mean and covariance estimate, but doing so requires establishing a mean
position at the junction with a covariance large enough to account for
its position after a left or right turn. This produces a clearly
unsatisfactory result in which the mean vector does not correspond to
either of the possible states of the vehicle and consequently has a
very large error covariance. Intuitively it seems clear that a better
option would be to maintain information about the two possible future
states rather than subsuming them into a single mean and covariance
estimate.

Historically there have been two distinct approaches for representing
"multimodal" information (e.g., as in the above example). One involves
Multiple Hypothesis Tracking (MHT), which maintains multiple mean and
covariance estimates corresponding to distinct possible states [1]. The
other approach is to attempt to maintain a parameterization of the
Probability Density Function (PDF) that defines the uncertainty
distribution associated with the state of the target. In practice, PDF
approximation methods typically only represent the significant modes of
the distribution in terms of their means and covariances, thus making
their representation all but identical to MHT. A key distinction is
that the PDF-based approach treats the set of estimates as defining a
union of Gaussian probability distributions. More specifically, the
distribution is expressed as a Gaussian Mixture Model (GMM) of the
form:

$$p(x) = \sum_{i=1}^{N} p_i \, \mathcal{N}(x; \mu_i, P_i) \tag{1}$$

where the $i$th mode has mean $\mu_i$, covariance $P_i$, and weight
$p_i$. The weights are all non-negative and sum to one.
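As an illustration of Eq. (1), the following minimal sketch evaluates a mixture density for scalar states. The function names and the two-mode "junction" example are hypothetical and chosen only for illustration; the paper's modes are vector-valued.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a scalar Gaussian N(x; mean, var)."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def gmm_pdf(x, modes):
    """Evaluate p(x) = sum_i p_i N(x; mu_i, P_i) for scalar modes.

    modes is a list of (weight, mean, variance) tuples.
    """
    total_weight = sum(w for w, _, _ in modes)
    if any(w < 0 for w, _, _ in modes) or abs(total_weight - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to one")
    return sum(w * gaussian_pdf(x, m, v) for w, m, v in modes)

# Hypothetical two-mode state: a vehicle that turned left or right.
junction = [(0.5, -10.0, 4.0), (0.5, 10.0, 4.0)]
print(gmm_pdf(-10.0, junction), gmm_pdf(0.0, junction))
```

The density at the midpoint between the two modes is far lower than at either mode, which is the situation a single mean and covariance estimate cannot express.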

The reason for adopting this form is that GMMs can conveniently
approximate a wide class of PDFs and are identical in implementation to
MHT. Unfortunately, representation is only one aspect of the overall
information management problem. There also must be tractable algorithms
for fusing information in a given representation.

The fusion of a set $S$ of mean and covariance estimates, each defining
a possible state of the target only one of which is guaranteed to be
consistent, with another set $T$ can be accomplished under the MHT
interpretation simply by forming the Cartesian product $S \times T$ and
applying the appropriate fusion algorithm (Kalman or CI) to the pairs.
Unfortunately, this yields a combined estimate that has
$O(|S| \cdot |T|)$ modes, which implies that the complexity of the
fused estimate exceeds that of the original estimates. This increasing
complexity will tend to exhaust available resources and therefore must
be mitigated.
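The growth in the number of modes can be seen in a small sketch. It assumes scalar states and statistically independent errors, so that the simplest Kalman-style fusion applies; the function names and the numbers are illustrative only.

```python
from itertools import product

def kalman_fuse_1d(est_a, est_b):
    """Fuse two scalar (mean, variance) estimates, assuming independent errors."""
    (a, A), (b, B) = est_a, est_b
    C = 1.0 / (1.0 / A + 1.0 / B)   # fused variance
    c = C * (a / A + b / B)         # fused mean
    return (c, C)

def fuse_mode_sets(S, T):
    """Fuse every pair in the Cartesian product S x T."""
    return [kalman_fuse_1d(s, t) for s, t in product(S, T)]

S = [(-10.0, 4.0), (10.0, 4.0)]               # two modes
T = [(-9.0, 1.0), (0.0, 1.0), (11.0, 1.0)]    # three modes
fused = fuse_mode_sets(S, T)
print(len(fused))  # |S| * |T| = 6 modes
```

Repeating this step with every new multimodal observation multiplies the mode count again, which is why a compression step is needed.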

3  Representation  Compression 

One of the most important features of the mean and covariance
representation of information is its constant complexity. Specifically,
the amount of information required to describe the state of the target
does not increase as new information is incorporated. However, when the
representation of state is generalized to maintain more than one mean
and covariance estimate, corresponding to different modes, the
update/fusion operation multiplies the number of modes. In order to
manage the complexity of the representation some form of representation
compression must be applied.

In most MHT applications, the proliferation of hypotheses is managed by
pruning the least likely ones according to some measure. A practical
problem with pruning is that the likelihood measure typically includes
many assumptions (e.g., PDF-related) that lead to more loss of correct
hypotheses than is expected, and any loss of the hypothesis that
corresponds to the true state of the target undermines the rigor of the
entire information management framework. Therefore, pruning cannot be
the primary mechanism for limiting the representational complexity of
our multimodal estimates.

If it is not possible to prune estimates (discard modes), then the only
alternative is to somehow coalesce similar modes to stay within a fixed
representational complexity budget. The key question is how to perform
this coalescing so that the integrity of the information is maintained.
If it is assumed that one mode of an estimate corresponds to the true
state of the target, and the others are spurious, then a mechanism
called Covariance Union (CU) can be applied. For example, given $n$
modes represented by estimates $(a_1, A_1), \ldots, (a_n, A_n)$, CU
produces an estimate $(u, U)$ that is guaranteed to be consistent as
long as one of the mode estimates $(a_i, A_i)$ is consistent. This is
achieved by guaranteeing that the estimate $(u, U)$ is consistent with
respect to each of the estimates:

$$U \ge A_1 + (u - a_1)(u - a_1)^T \tag{2}$$
$$U \ge A_2 + (u - a_2)(u - a_2)^T \tag{3}$$
$$\vdots \tag{4}$$
$$U \ge A_n + (u - a_n)(u - a_n)^T \tag{5}$$

where some measure of the size of $U$, e.g., determinant, is minimized.
The consistency of the CU estimate is assured for each of the $n$
inequalities because the difference between the mean $u$ and $a_i$ is
accounted for in the covariance $U$ by the addition of the square of
that difference to the covariance $A_i$.
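The inequalities (2)-(5) can be checked numerically for a candidate $(u, U)$. The sketch below is restricted to two-dimensional states so that the eigenvalues have a closed form; the helper names and the example modes are assumptions made for illustration.

```python
import math

def min_eig_sym2(M):
    """Smallest eigenvalue of a symmetric 2x2 matrix [[p, q], [q, r]]."""
    p, q, r = M[0][0], M[0][1], M[1][1]
    return 0.5 * (p + r) - math.sqrt(0.25 * (p - r) ** 2 + q * q)

def cu_slack(u, U, modes):
    """Smallest eigenvalue of U - A_i - (u - a_i)(u - a_i)^T over all modes.

    A non-negative value means (u, U) satisfies every CU inequality.
    """
    worst = float("inf")
    for a, A in modes:
        d = [u[0] - a[0], u[1] - a[1]]
        X = [[U[i][j] - A[i][j] - d[i] * d[j] for j in range(2)]
             for i in range(2)]
        worst = min(worst, min_eig_sym2(X))
    return worst

identity = [[1.0, 0.0], [0.0, 1.0]]
modes = [([-1.0, 0.0], identity), ([1.0, 0.0], identity)]
u = [0.0, 0.0]
U_ok = [[2.0, 0.0], [0.0, 1.0]]    # just large enough along the x axis
U_bad = [[1.5, 0.0], [0.0, 1.0]]   # too small along the x axis
print(cu_slack(u, U_ok, modes), cu_slack(u, U_bad, modes))
```

A slack of exactly zero, as for `U_ok`, indicates that a constraint is active, which is what one expects at a minimal solution.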

Given a complexity budget of $N$ modes, the fusion of two $N$-mode
estimates will produce a new estimate with $N^2$ modes, which must be
reduced to $N$ modes. This can be achieved by applying a clustering
algorithm (e.g., standard k-means clustering based on a
covariance-weighted distance measure such as Mahalanobis). Each of the
$N$ clusters can be combined into a single mean and covariance estimate
using CU, and the rigor of the framework is guaranteed because one of
the $N$ estimates will be consistent as long as one of the original
$N^2$ estimates was consistent.

This application of CU for mode reduction is appropriate for MHT-type
applications. However, CU must be generalized to accommodate
weights/probabilities associated with modes when the representation is
interpreted to be a Gaussian mixture approximation of a multimodal
probability distribution. This requires a generalization of the
definition of consistency for multimodal estimates. We require that
each probability $p_i$ be greater than or equal to the actual
probability that estimate/mode $i$ corresponds to the true state of the
target. The problem is that any small but nonzero probability implies
that the associated estimate may represent the true state of the
target, so consistency requires it to have the same influence on the CU
result as an estimate with a much higher probability. The only
difference is that the final result can be interpreted as having an
associated probability that is equal to $\min(1, \sum_i p_i)$, where
the min function is required because the weights are assumed to be
conservative and thus may sum to a value greater than unity. Thus, the
MHT case is equivalent to having no probability estimates, which
requires unity to be assumed for every mode.

4  Computational  Methods 

Unlike Covariance Intersection, for which efficient semidefinite matrix
optimization methods can be applied, Covariance Union involves
inequalities with terms that depend on the means of the estimates. This
dependency on the means requires a more sophisticated variant of the
methods that are applied for straight semidefinite matrix equations.
For our experiments, however, we have applied simpler generic
optimization methods, which are discussed in this section.

The optimization problem's feasible region is the intersection of a set
of inequalities, each of which can be written as a linear matrix
inequality in $u$ and $U$:

$$\begin{bmatrix} U - A_k & u - a_k \\ (u - a_k)^T & 1 \end{bmatrix} \ge 0 \tag{6}$$
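The equivalence between this block form and the CU inequalities (2)-(5) follows from the standard Schur complement condition. Because the lower-right block is the positive scalar 1, the block matrix is positive semidefinite exactly when the Schur complement of that block is:

```latex
\begin{bmatrix} U - A_k & u - a_k \\ (u - a_k)^T & 1 \end{bmatrix} \ge 0
\;\iff\;
(U - A_k) - (u - a_k)\,1^{-1}\,(u - a_k)^T \ge 0
\;\iff\;
U \ge A_k + (u - a_k)(u - a_k)^T .
```

The block form is linear in the unknowns $u$ and $U$, whereas the original inequality is quadratic in $u$.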


The intersection of all of the constraints can then be represented as a
larger block-diagonal inequality in which the diagonal elements are the
LMIs shown above. This defines a region which is convex but nonsmooth.
The fact that the constraints are nonsmooth rules out most commonly
available high-performance optimization packages since they typically
expect the objective and constraint functions to be twice continuously
differentiable.

The trace measure is linear and so can be posed as a standard SDP
problem [6]. There is no such formulation for other measures such as
determinant or the Frobenius norm, so a general-purpose nonlinear
optimizer such as SolvOpt [3] must be used to handle arbitrary norms.
SolvOpt is an implementation of Shor's r-algorithm [4]. The initial
feasible solution is generated by simply setting $u$ to zero and
summing the right-hand sides of the simplified constraints:

$$u_0 = 0 \tag{7}$$
$$U_0 = \sum_{k=1}^{n} \left(A_k + a_k a_k^T\right) \tag{8}$$
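That this starting point is feasible can be verified directly. With $u_0 = 0$ the $j$th constraint reads $U \ge A_j + a_j a_j^T$, and subtracting its right-hand side from $U_0$ leaves a sum of positive semidefinite terms (each $A_k$ is a covariance matrix and each $a_k a_k^T$ is an outer product):

```latex
U_0 - \left(A_j + a_j a_j^T\right)
  = \sum_{k \neq j} \left(A_k + a_k a_k^T\right) \ge 0,
  \qquad j = 1, \dots, n .
```

The bound is generally far from minimal, which is why it serves only as an initial point for the optimizer.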

We have developed several approximate solutions that can also be
applied which are much faster while still preserving consistency. These
methods are suitable for real-time use and could also be used to
generate better starting points for iterative improvement. Most of them
rely on separation of the $u$ and $U$ optimizations to achieve
computational savings. If the $u$ vector is fixed at a specific value
then the problem is considerably simplified: find a minimal $U$ such
that $U \ge F_k$, where the $F_k$ are constant. This simpler problem
can yield closed-form solutions when there are only two estimates to be
combined. For example, if determinant is the measure used then the
resulting $U$ can be computed directly via simultaneous
diagonalization:

$$U = (V^T)^{-1} \max(V^T A V,\; V^T B V)\, V^{-1} \tag{9}$$

where max is the component-wise maximum of two diagonal matrices and
$V$ contains the generalized eigenvectors of $A$ and $B$. Using Matlab
it would be computed as [V, D] = eig(A, B).
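A minimal sketch of Eq. (9) for the 2x2 case, written in plain Python, is given below. It assumes $B$ is positive definite and normalizes $V$ so that $V^T B V = I$ (obtained here through a Cholesky factor of $B$ rather than a library eigensolver); the function names are illustrative and are not part of the implementation described in this paper.

```python
import math

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(X):
    return [[X[j][i] for j in range(2)] for i in range(2)]

def union_fixed_mean_2x2(A, B):
    """Evaluate Eq. (9) for symmetric 2x2 A and positive definite B."""
    # Cholesky factor of B, so that B = L L^T.
    l11 = math.sqrt(B[0][0])
    l21 = B[1][0] / l11
    l22 = math.sqrt(B[1][1] - l21 * l21)
    L = [[l11, 0.0], [l21, l22]]
    L_inv = [[1.0 / l11, 0.0], [-l21 / (l11 * l22), 1.0 / l22]]
    # C = L^{-1} A L^{-T} carries the generalized eigenvalues of (A, B).
    C = matmul(matmul(L_inv, A), transpose(L_inv))
    # Rotation Q that diagonalizes the symmetric matrix C.
    theta = 0.5 * math.atan2(2.0 * C[0][1], C[0][0] - C[1][1])
    cs, sn = math.cos(theta), math.sin(theta)
    Q = [[cs, -sn], [sn, cs]]
    D = matmul(matmul(transpose(Q), C), Q)   # V^T A V, with V = L^{-T} Q
    # V^T B V is the identity, so the component-wise max is max(D_ii, 1).
    M = [[max(D[0][0], 1.0), 0.0], [0.0, max(D[1][1], 1.0)]]
    W = matmul(L, Q)                         # W = (V^T)^{-1}
    return matmul(matmul(W, M), transpose(W))

A = [[4.0, 0.0], [0.0, 1.0]]
B = [[1.0, 0.0], [0.0, 4.0]]
print(union_fixed_mean_2x2(A, B))
```

For these two axis-aligned covariances the result is 4 times the identity, the smallest matrix that dominates both inputs.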

One such approximation is to assume that real-life applications produce
estimates in which the optimal mean $u$ can be modeled as a convex
combination of the input means. This constrains $u$ to a bounded region
in $\mathbb{R}^n$. Indeed if there are only two estimates (a, A) and
(b, B) to be unioned then $u$ is constrained to the line segment
between $a$ and $b$.

Let $c = b - a$ and $u = a + \omega c$. The convex combination problem
can then be stated as:

Find a minimal $U$ such that:

$$U \ge A + \omega^2 c c^T \tag{10}$$
$$U \ge B + (1 - \omega)^2 c c^T \tag{11}$$

This can easily be solved via any number of simple one-dimensional
search techniques, using the previously noted formulae to compute $U$
for a fixed value of $u$.
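The one-dimensional search over $\omega$ can be sketched as follows. To keep the example short it is restricted to scalar states, where the minimal $U$ for a fixed $\omega$ is simply the larger of the two right-hand sides of (10) and (11); golden-section search is one possible choice of search technique, and the function names are hypothetical.

```python
def union_var_for_omega(a, A, b, B, omega):
    """Smallest scalar U satisfying (10) and (11) for u = a + omega * (b - a)."""
    c = b - a
    return max(A + (omega * c) ** 2, B + ((1.0 - omega) * c) ** 2)

def convex_combination_cu_1d(a, A, b, B, tol=1e-10):
    """Golden-section search over omega in [0, 1]; returns (u, U)."""
    inv_phi = (5 ** 0.5 - 1) / 2
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        x1 = hi - inv_phi * (hi - lo)
        x2 = lo + inv_phi * (hi - lo)
        f1 = union_var_for_omega(a, A, b, B, x1)
        f2 = union_var_for_omega(a, A, b, B, x2)
        if f1 <= f2:
            hi = x2   # minimum lies in [lo, x2]
        else:
            lo = x1   # minimum lies in [x1, hi]
    omega = 0.5 * (lo + hi)
    return a + omega * (b - a), union_var_for_omega(a, A, b, B, omega)

print(convex_combination_cu_1d(0.0, 1.0, 4.0, 1.0))
```

The search relies on the objective being convex in $\omega$, which holds here because it is the maximum of two convex quadratics.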




In our experiments it has been observed that convex-combination CU
produces reasonably good approximations to the optimal values when
applied to two estimates in low dimensions. However, its performance
has not yet been fully characterized. It was evaluated using the
determinant measure on pairs of estimates whose mean components and
covariance eigenvalues were randomly chosen on the interval (0, 1) and
the dimensionality $n$ varied from 2 to 20. For the two-dimensional
data the determinant of $U$ produced by the convex-combination CU
averaged only 4% larger than the optimal value. However, for $n = 20$
it was 20% larger. So its performance degraded as $n$ was increased (as
could be expected from the definition of determinant and the method
used to construct the test set) but the increase appeared to be only
proportional to $\sqrt{n}$.

Another fast real-time approximation can be derived by noting that the
optimal two-element CU update tends to produce a $u$ vector for which
the two constraints are similar in size and shape. In other words, it
has a tendency to select a $u$ vector for which:

$$A + (u - a)(u - a)^T \approx B + (u - b)(u - b)^T \tag{12}$$

This observation suggests a strategy in which $u$ is fixed at the point
where the difference is minimized. If the Frobenius norm of the
difference is minimized then it leads to a closed-form solution for
$u$:

$$u = \frac{a + b}{2} + \left((c^T c) I + c c^T\right)^{-1} (A - B)\, c / 2 \tag{13}$$
$$c = (a - b)/2 \tag{14}$$
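Equations (13) and (14) can be exercised with a short 2-D sketch. The helper names and the test values are assumptions for illustration; the code also evaluates the squared Frobenius norm of the difference in (12) so that the result can be compared against nearby points. It requires $a \neq b$, since otherwise the matrix being inverted is zero.

```python
def frobenius_balanced_mean(a, A, b, B):
    """Closed-form u of Eqs. (13)-(14) for 2-D estimates (requires a != b)."""
    c = [(a[0] - b[0]) / 2.0, (a[1] - b[1]) / 2.0]
    cc = c[0] * c[0] + c[1] * c[1]
    # G = (c^T c) I + c c^T
    G = [[cc + c[0] * c[0], c[0] * c[1]],
         [c[0] * c[1], cc + c[1] * c[1]]]
    det = G[0][0] * G[1][1] - G[0][1] * G[1][0]
    # rhs = (A - B) c / 2
    rhs = [sum((A[i][j] - B[i][j]) * c[j] for j in range(2)) / 2.0
           for i in range(2)]
    # v = G^{-1} rhs, using the explicit 2x2 inverse
    v = [(G[1][1] * rhs[0] - G[0][1] * rhs[1]) / det,
         (-G[1][0] * rhs[0] + G[0][0] * rhs[1]) / det]
    return [(a[0] + b[0]) / 2.0 + v[0], (a[1] + b[1]) / 2.0 + v[1]]

def mismatch(u, a, A, b, B):
    """Squared Frobenius norm of the difference of the two sides of (12)."""
    total = 0.0
    for i in range(2):
        for j in range(2):
            lhs = A[i][j] + (u[i] - a[i]) * (u[j] - a[j])
            rhs = B[i][j] + (u[i] - b[i]) * (u[j] - b[j])
            total += (lhs - rhs) ** 2
    return total

a, A = [0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]]
b, B = [2.0, 0.0], [[3.0, 0.0], [0.0, 1.0]]
u = frobenius_balanced_mean(a, A, b, B)
print(u, mismatch(u, a, A, b, B))
```

In this example the mean is pulled away from the midpoint toward the estimate with the larger covariance, so that both right-hand sides of the constraints become equal.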

This solution has only been tested with random data. It produces good
estimates when the difference between the estimates' means is large
compared to the difference between the estimates' covariance matrices.

Large problems with many estimates can be broken down into a set of
smaller problems by recursively solving two estimates at a time. For
example, if there are three estimates $(a_1, A_1)$, $(a_2, A_2)$, and
$(a_3, A_3)$ they can be separated into two smaller problems:

1. Compute $(u_1, U_1)$ as the union of $(a_1, A_1)$ and $(a_2, A_2)$.

2. Compute $(u_2, U_2)$ as the union of $(u_1, U_1)$ and $(a_3, A_3)$.

3. $(u_2, U_2)$ is the solution.

The main advantage of this approach is that two-element unions can be
solved quickly via convex combination CU using closed-form formulas
described earlier. But the method has one serious drawback that is
illustrated in Figure 1: it does not guarantee consistency. It does
guarantee that the covariance matrices $U_k$ will never shrink and will
most likely grow on every iteration, but there is no guarantee that
when $U_{k+1}$ is re-centered at a new mean $u_{k+1}$ it will still be
consistent with the earlier estimates. Previous experiments did not
observe this effect due to the extra slack provided by the
convex-combination formulation. The solution can be expected to be
consistent as long as the errors/biases in the combined estimates are
statistically independent. The CU equations can be easily generalized
to account for potentially correlated biases in the means.¹

Figure 1: An example illustrating inconsistency with recursively
applying the two-element unions. The means and 1σ ellipses of the input
set (a, A) are shown as the set of thin solid ellipses with their means
at +. The batch CU estimate is the thick solid ellipse with the mean at
o. The pairwise CU is the thick, dashed ellipse with its mean at x. A
necessary condition for (u, U) to be consistent is that all of the
input means should lie within the 1σ covariance ellipse. However, many
of the means for the input set lie outside for the pairwise fused
result.
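The loss of consistency under recursive pairwise union can be reproduced with scalar estimates. The sketch below assumes a hypothetical helper `union_1d` that solves the two-element scalar problem exactly (minimum variance subject to both constraints); it is an illustration of the effect, not the experiment behind Figure 1.

```python
def union_1d(a, A, b, B):
    """Minimum-variance scalar CU of (a, A) and (b, B); returns (u, U)."""
    if B + (a - b) ** 2 <= A:      # centering on a already covers (b, B)
        return a, A
    if A + (b - a) ** 2 <= B:      # centering on b already covers (a, A)
        return b, B
    # Otherwise both constraints are active at the optimum.
    u = 0.5 * (a + b) + (A - B) / (2.0 * (a - b))
    return u, A + (u - a) ** 2

modes = [(0.0, 1.0), (2.0, 1.0), (10.0, 1.0)]

# Recursive pairwise union: fold the modes in one at a time.
u, U = modes[0]
for a, A in modes[1:]:
    u, U = union_1d(u, U, a, A)

# Slack of the final (u, U) against each ORIGINAL mode; negative = violated.
slack = [U - (A + (u - a) ** 2) for a, A in modes]
print(u, U, slack)
```

The final estimate covers the intermediate union and the last mode, but its slack against the first mode is negative: re-centering the mean moved it away from that mode by more than the accumulated variance accounts for.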

4.1  Implementation 

The SolvOpt package is able to find a minimizing vector $x$ according
to a cost function $f(x)$, which may be optionally constrained by some
function $g(x)$. We choose $x$ to be the $n$ elements of $u$ plus the
$n(n+1)/2$ elements of the upper triangle of $U$.

We minimize the determinant of the covariance, $U$, subject to the
constraint that

$$X_k = U - A_k - (u - a_k)(u - a_k)^T \tag{15}$$

has non-negative eigenvalues, for all $k \in [1, \ldots, m]$, where $m$
is the number of estimates given.

To find $|U|$, we perform an LU decomposition of matrix $U$, to
generate an upper triangular matrix $W$ and a lower triangular matrix
$L$, such that $LW = U$. $L$ and $W$ are given by

$$L_{ii} = 1 \tag{16}$$
$$L_{ij} = \frac{1}{W_{jj}} \left( U_{ij} - \sum_{k=1}^{j-1} L_{ik} W_{kj} \right); \quad i > j \tag{17}$$
$$W_{ij} = U_{ij} - \sum_{k=1}^{i-1} L_{ik} W_{kj} \tag{18}$$

Then $|U| = \prod_{i=1}^{n} W_{ii}$. The complexity cost of this
operation is $O(n^3)$.

¹ The case of common bias terms just requires an additional parameter
$\alpha_i$ per estimate:
$U \ge A_i/\alpha_i + (u - a_i)(u - a_i)^T/(1 - \alpha_i)$
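Equations (16)-(18) correspond to a Doolittle-style factorization with a unit-diagonal $L$. A minimal sketch follows; it uses no pivoting, which is an assumption that is reasonable for positive definite covariance matrices but fails if a zero pivot $W_{ii}$ is encountered. The function name is illustrative.

```python
def lu_determinant(U):
    """Determinant of U via U = L W with unit-diagonal L (no pivoting)."""
    n = len(U)
    L = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # Row i of the upper triangular factor W (Eq. 18).
        for j in range(i, n):
            W[i][j] = U[i][j] - sum(L[i][k] * W[k][j] for k in range(i))
        # Column i of the lower triangular factor L (Eq. 17).
        for r in range(i + 1, n):
            L[r][i] = (U[r][i] - sum(L[r][k] * W[k][i] for k in range(i))) / W[i][i]
    det = 1.0
    for i in range(n):
        det *= W[i][i]   # |U| is the product of the diagonal of W
    return det

print(lu_determinant([[4.0, 2.0, 0.0], [2.0, 5.0, 1.0], [0.0, 1.0, 3.0]]))
```

The triple nesting of the loops and the inner sums is the source of the $O(n^3)$ cost quoted above.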

The single value SolvOpt uses to constrain the minimization must be
nonpositive. Since we want to constrain the eigenvalues of (15) to be
nonnegative for all $k \in [1, \ldots, m]$, we simply find the most
negative of all $nm$ eigenvalues, $\lambda_{\min}$, and return
$-\lambda_{\min}$ as the constraint.

To compute the $n$ eigenvalues of each $X_k$, we follow a two-step
procedure:

1. Find the Hessenberg form $H_k = \mathrm{Hess}(X_k)$

2. Apply the QR transform to $H_k$ until the eigenvalues are isolated
on the diagonal

The Hessenberg form of a symmetric matrix is tridiagonal, which
simplifies the actual eigenvalue calculations. This technique works
because the original matrix $X_k$ and its Hessenberg form $H_k$ have
the same eigenvalues.

The QR algorithm iterates on $H_k$ until it approaches the Schur normal
form, which contains the eigenvalues on the diagonal.

Each QR decomposition of $H_k$ results in $Q$, which is orthogonal, and
$R$, which is upper triangular, such that $QR = H_k$. The algorithm
proceeds as follows

$$QR = H_{k,s} \tag{19}$$
$$H_{k,s+1} = RQ \tag{20}$$

for $s = 0, 1, 2, \ldots$, until $H_{k,s}$ is in the Schur normal form.
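The iteration (19)-(20) can be sketched in a few lines of plain Python. This simplified version is an assumption-laden illustration: it skips the Hessenberg reduction, uses modified Gram-Schmidt for the QR step, applies no shifts, and runs a fixed number of iterations, so it requires nonsingular iterates and eigenvalues of distinct magnitude.

```python
def qr_decompose(H):
    """Modified Gram-Schmidt QR: returns (Q, R) with H = Q R."""
    n = len(H)
    cols = [[H[i][j] for i in range(n)] for j in range(n)]   # columns of H
    q_cols = []
    R = [[0.0] * n for _ in range(n)]
    for j in range(n):
        v = cols[j][:]
        for i, q in enumerate(q_cols):
            R[i][j] = sum(q[k] * v[k] for k in range(n))
            v = [v[k] - R[i][j] * q[k] for k in range(n)]
        R[j][j] = sum(x * x for x in v) ** 0.5
        q_cols.append([x / R[j][j] for x in v])
    Q = [[q_cols[j][i] for j in range(n)] for i in range(n)]
    return Q, R

def qr_eigenvalues(X, iterations=200):
    """Unshifted QR iteration H_{s+1} = R Q; returns the final diagonal."""
    n = len(X)
    H = [row[:] for row in X]
    for _ in range(iterations):
        Q, R = qr_decompose(H)
        H = [[sum(R[i][k] * Q[k][j] for k in range(n)) for j in range(n)]
             for i in range(n)]
    return [H[i][i] for i in range(n)]

eigs = qr_eigenvalues([[2.0, 1.0], [1.0, 2.0]])
print(min(eigs))  # the smallest eigenvalue is what the constraint needs
```

Each step is a similarity transform, since $RQ = Q^T (QR) Q$, so the eigenvalues are preserved while the off-diagonal entries decay.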

As has been discussed, SolvOpt evaluates the cost and constraint
function callbacks to minimize $|U|$ over the $n + n(n+1)/2$ elements
of $u$ and the upper triangle of $U$. To merge $m$ estimates, the cost
function performs $O(n^3)$ operations and the constraint function
$O(mn^3)$. The number of iterations which SolvOpt must perform varies
widely, from 1500 to 15000, depending on the batch dimensions and also
the input data values. In the next section we present results showing
the overall computational cost of this approach.

5  Experimental  Results 

In this section we present experimental results for different
implementations of the CU algorithm, using SolvOpt, written in both
Matlab and C. We have timed the application of CU on sets of random
data to explore actual execution times for various dimensions $n$ and
modes $N$. The times listed in the following tables were obtained on a
single 1.5 GHz Pentium computer.


Average execution times for Matlab (in seconds)

Dimensions   2 Modes   4 Modes   8 Modes   16 Modes
    2           0.91      1.21      1.94       2.22
    4          22.76     10.75     12.78      21.63
    6          40.95     80.58     55.68      74.41
    8         230.50    204.36    231.83     276.55

Average execution times for C (in seconds)

Dimensions   2 Modes   4 Modes   8 Modes   16 Modes
    2           0.00      0.00      0.01       0.03
    4           0.43      0.62      1.89       2.73
    6           2.42      6.25     14.18      30.61
    8          11.50     37.05     63.16     146.87

These results show that the generality of the SolvOpt algorithm incurs
a significant computational cost that makes it impractical for most
real-time applications when the dimensionality and number of modes are
high.

6  Discussion 

In this paper we have examined the problem of representing multimodal
information using MHT and GMMs. We have discussed the fusion of
information represented in the form of multiple mean and covariance
estimates corresponding to distinct possible states, or modes of a
distribution, for a tracked target. We have discussed how the fusion
operation results in a multiplicative increase in the complexity of the
representation that will grow exponentially over time unless bounded by
a mechanism that can compress the representation to a fixed number of
modes. We have described how Covariance Union can be used to coalesce
modes while preserving the rigor of the information management
framework. Experiments demonstrate the effectiveness of our approach.

The main result of this paper is our SolvOpt-based algorithm, with
implementations in Matlab and C, for computing CU solutions.
Experimental results corroborate the correctness of the algorithm, but
they also show that it is not practical for real-time applications. It
is expected, however, that our experimental codes will provide the
"gold standard" against which faster approximations of CU can be
evaluated.